Skip to content

Implement logic orchestration for Git Pull/Push operations #13518

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 26 commits into
base: main
Choose a base branch
from

Conversation

wanling0000
Copy link
Contributor

@wanling0000 wanling0000 commented Jul 9, 2025

Closes #12350

Final PR based on draft: JabRef#714

This PR implements the logic-layer orchestration for Git push and pull operations with semantic merging support. It enables JabRef to semantically merge .bib files when possible, avoiding manual conflict resolution for simple changes.

This is part of ongoing work to support full Git-based collaboration for .bib files. Further tests and UI integration are planned.

Steps to test

This PR implements the first part of Git sync support by enabling automatic semantic merges when there are no conflicts. The scenario is tested via TDD in GitSyncServiceTest, simulating the following steps:

  • Alice creates an initial commit and pushes it to the remote.
  • Bob clones the repo and modifies entry b, then pushes.
  • Alice modifies entry a, then pulls.
    Since they changed different entries, we expect a clean merge without user intervention. The orchestrator is GitSyncService, and helper utilities and value objects are temporarily located in org.jabref.logic.git.util.

Documentation Supplement

🧩 Table: Git-related Class Responsibilities

Class Responsibility
GitSyncService Orchestrates Git operations (pull, merge, push). Handles conflict detection, semantic merging, and committing results.
GitHandler Wrapper around JGit, provides high-level Git operations like fetch, commit, push, and branch management.

🔧 Key Methods in GitSyncService

Method Purpose Description
fetchAndMerge(Path) Pulls and merges changes Runs status checks, fetch, 3-way merge, conflict detection, and auto-commit. Invokes a UI conflict resolution strategy if needed.
performSemanticMerge(...) Performs semantic 3-way merge Compares base/local/remote .bib content semantically. Uses SemanticMerger and conflict resolution strategy.
push(Path) Pushes local changes to remote Based on sync status, decides whether to commit and push. Fails if unresolved conflicts exist.

📦 Package: org.jabref.logic.git.io

This package bridges Git and JabRef's BibTeX data model by handling read/write operations for .bib files at specific Git revisions.

Class Responsibility
GitBibParser Parses .bib content from a Git commit into a BibDatabaseContext using JabRef's parser.
GitFileReader Reads raw .bib text from a specific Git commit.
GitFileWriter Writes BibDatabaseContext content back to a .bib file after merge.
GitRevisionLocator Locates base, local, and remote commits for 3-way merge.
RevisionTriple Data structure that holds the result of locating base, local, and remote revisions.

📦 Package: org.jabref.logic.git.status

Handles detection of Git repository and file tracking/sync status. Used before pull/push to determine the correct operation.

Class Responsibility
GitStatusChecker Static utility to check Git status of a .bib file and return a snapshot.
GitStatusSnapshot Immutable snapshot of a file's Git tracking/conflict/sync state.
SyncStatus Enum representing sync state (e.g., UP_TO_DATE, BEHIND, DIVERGED, etc.).

📦 Package: org.jabref.logic.git.conflicts

Handles semantic conflict detection and resolution strategies during 3-way merge.

Class Responsibility
SemanticConflictDetector Detects entry-level conflicts between base/local/remote BibDatabaseContexts.
ThreeWayEntryConflict Data structure representing a single entry-level 3-way conflict.
GitConflictResolverStrategy Interface for resolving conflicts — supports GUI/CLI strategies.
CliConflictResolverStrategy CLI implementation (planned for JabKit, not yet integrated). GUI is currently prioritized.

Conflict Resolution Flow:

  1. Detect conflicts with SemanticConflictDetector.

  2. If conflicts exist, invoke a GitConflictResolverStrategy:

    • In GUI: use GuiConflictResolverStrategy

    • In future CLI: use CliConflictResolverStrategy

  3. Apply merge plan if resolution succeeds.


📦 Package: org.jabref.logic.git.merge

Executes semantic 3-way merge operations on .bib files, including building merge plans and applying field-level changes.

Class Responsibility
MergePlan Represents semantic changes from base → remote to be applied to local.
SemanticMerger Applies a MergePlan to a local BibDatabaseContext.
GitSemanticMergeExecutor Interface defining a performSemanticMerge() contract.
GitSemanticMergeExecutorImpl Default implementation of the above; invokes detector and merger.
GitMergeUtil Utility class for low-level field and entry merging.

💡 GUI Module Integration Note

  • GUI dependencies are temporarily introduced to reuse the existing MergeEntriesDialog for conflict resolution.

  • This is injected via GitConflictResolverStrategy to keep the dependency direction correct (logic → gui).

  • A major UI refactor is WIP.

Class Responsibility
GuiConflictResolverStrategy GUI-based conflict resolution strategy. Reuses MergeEntriesDialog to let users manually resolve semantic conflicts in BibEntrys.

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • [/] Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • [/] Screenshots added in PR description (if change is visible to the user)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

@@ -27,7 +27,7 @@ public interface LibraryTabContainer {
* Closes a designated libraryTab
*
* @param tab to be closed.
* @return true if closing the tab was successful
* @return true if closing the tab was isSuccessful
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IntelliJ probably replaced this

Suggested change
* @return true if closing the tab was isSuccessful
* @return true if closing the tab was successful

@wanling0000 wanling0000 force-pushed the clean-gsoc-git-support-init branch from 8302ebc to 32c30a0 Compare July 14, 2025 00:54
Comment on lines 77 to 78
// Status is BEHIND or DIVERGED
try (Git git = gitHandler.open()) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is trivial and can be derived from the code context. The status check above clearly indicates this is for BEHIND or DIVERGED cases.

Comment on lines +145 to +146
// Debug hint: Show the created git graph on the command line
// git log --graph --oneline --decorate --all --reflog
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code should be removed as per project guidelines. Git history should be used instead of keeping debug hints in comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove in the end

import static org.mockito.Mockito.when;

class GitSyncServiceTest {
private Git git;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name this aliceGit to be consistent with @BeforeEach

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it, I think private Git git was from an earlier test setup. Later on, aliceGit was introduced and assigned to it, which made the git field redundant. Now I removed it and consistently use aliceGit, bobGit, and remoteGit to clearly distinguish the roles of each repository

@@ -113,25 +114,27 @@ void aliceBobSimple(@TempDir Path tempDir) throws Exception {

// Alice clone remote -> local repository
aliceDir = tempDir.resolve("alice");
aliceGit = Git.cloneRepository()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, this was OK.

Maybe, you need to close aliceGit at the end of the method.

Copy link

trag-bot bot commented Jul 17, 2025

@trag-bot didn't find any issues in the code! ✅✨

@wanling0000 wanling0000 marked this pull request as ready for review July 17, 2025 14:11
BibEntry local = conflict.local();
BibEntry remote = conflict.remote();

// Create Dialog + Set Title + Configure Diff Highlighting
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is trivial and only describes what the code does without adding any additional information or reasoning. The operations are self-evident from the code that follows.

} catch (JabRefException e) {
dialogService.showErrorDialogAndWait("Git Pull Failed", e);
// TODO: error handling
} catch (GitAPIException e) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapping exceptions in RuntimeException loses the specific error context and makes error handling less effective. Should use more specific exception handling.

}

public MergeResult pull() throws IOException, GitAPIException, JabRefException {
MergeResult result = syncService.fetchAndMerge(bibFilePath);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operation might be long-running and should use BackgroundTask instead of direct execution to prevent UI freezing during Git operations.


@Override
public Optional<BibDatabaseContext> resolveConflicts(List<ThreeWayEntryConflict> conflicts, BibDatabaseContext remote) {
List<BibEntry> resolved = new ArrayList<>();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using ArrayList constructor directly instead of modern Java collections. Consider using List.of() for initial empty list creation to follow modern Java practices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But mutable list is required here, cannot use List.of()

BibEntry baseEntry = baseMap.get(key);

if (baseEntry == null) {
// New entry (not in base)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment does not add new information and simply restates what is obvious from the code. Such comments should be removed.

Comment on lines +6 to +8
BibEntry base,
BibEntry local,
BibEntry remote
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The record fields lack validation to ensure none of the BibEntry parameters can be null. This could lead to NullPointerExceptions during conflict resolution operations.

import org.jabref.model.util.DummyFileUpdateMonitor;

public class GitBibParser {
public static BibDatabaseContext parseBibFromGit(String bibContent, ImportFormatPreferences importFormatPreferences) throws JabRefException {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method should return Optional instead of potentially null BibDatabaseContext to follow modern Java practices for null safety.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result.getDatabaseContext() ensures that null values are never returned

import org.jspecify.annotations.NonNull;

public class GitFileReader {
// Unit test is in the GitSyncServiceTest
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment provides no additional information beyond what can be inferred from the code and test location. It should be removed as it doesn't add value or reasoning.


public class GitFileReader {
// Unit test is in the GitSyncServiceTest
public static String readFileFromCommit(Git git, RevCommit commit, @NonNull Path relativePath) throws JabRefException {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method returns String which could be null. Modern Java practices suggest using Optional for potentially null returns to make the API more explicit.

SelfContainedSaveConfiguration saveConfiguration = new SelfContainedSaveConfiguration();
Charset encoding = bibDatabaseContext.getMetaData().getEncoding().orElse(StandardCharsets.UTF_8);

synchronized (bibDatabaseContext) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synchronizing on a parameter object is dangerous as it can lead to deadlocks if the same object is used elsewhere. Consider using a private final lock object or more fine-grained synchronization.

Comment on lines +27 to +30
// assumes the remote branch is 'origin/main'
ObjectId headId = repo.resolve(HEAD);
// and uses the default remote tracking reference
// does not support multiple remotes or custom remote branch names so far
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are trivial and only restate what is obvious from the code itself. They don't provide additional information about the reasoning or implementation details.

* @param local the current local branch tip
* @param remote the tip of the remote tracking branch (typically origin/main)
*/
public record RevisionTriple(RevCommit base, RevCommit local, RevCommit remote) { }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameters should be validated for null values since they represent critical commit references. The record should include validation in a compact constructor to prevent null values.

Comment on lines +21 to +22
// 1. make a copy of the remote database
BibDatabase newDatabase = new BibDatabase();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment restates what's obvious from the code without providing additional insight or reasoning behind the operation.

}

@Override
public MergeResult merge(BibDatabaseContext base, BibDatabaseContext local, BibDatabaseContext remote, Path bibFilePath) throws IOException, IOException {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method parameters should be validated for null values before use. Currently there are no null checks which could lead to NullPointerException during execution.

applyPatchToDatabase(local, plan.fieldPatches());

for (BibEntry newEntry : plan.newEntries()) {
BibEntry clone = (BibEntry) newEntry.clone();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Object.clone() is discouraged as per Effective Java. Should implement a proper copy constructor or factory method instead to ensure proper object copying.

import org.jabref.logic.bibtex.comparator.BibEntryDiff;

public record MergeResult(boolean isSuccessful, List<BibEntryDiff> conflicts) {
private static boolean SUCCESS = true;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SUCCESS constant should be declared as final since it's a static field that represents an immutable state. This follows Java best practices for constant values.

Comment on lines +4 to +10
UP_TO_DATE, // Local and remote are in sync
BEHIND, // Local is behind remote, pull needed
AHEAD, // Local is ahead of remote, push needed
DIVERGED, // Both local and remote have new commits; merge required
CONFLICT, // Merge conflict detected
UNTRACKED, // Not under Git control
UNKNOWN // Status couldn't be determined
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for enum values are trivial and simply restate what can be derived from the enum names. These comments don't provide additional information about implementation details or reasoning.

@wanling0000 wanling0000 marked this pull request as draft July 17, 2025 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add (simple) git support
3 participants